PostgreSQL 数据库性能 分区表哈希智能算法

1 背景知识

对于两个大分区表的Hash 连接,性能与hash table的大小直接相关。
1、如果hash table数据量能够被work_mem缓存,那性能是最优的。
2、如果能够利用分区的特点按分区进行hash 链接,那对于性能是非常大的提升。
3、PostgreSQL有两个参数 enable_partitionwise_joinenable_partitionwise_aggregate 用于分区智能连接和聚合。
4、默认这两个参数是 off 的。
5、开启这两个参数,对于采用hash join 或 hash aggregate 的执行方式有性能提升。

以下我们举例来看该参数对于执行计划的影响。

2 环境准备

DROP TABLE t01;
CREATE TABLE t01(id1 integer,name1 text) PARTITION BY HASH(id1) ;
CREATE TABLE t01_1 PARTITION OF t01  FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE t01_2 PARTITION OF t01  FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE t01_3 PARTITION OF t01  FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE t01_4 PARTITION OF t01 FOR VALUES WITH (MODULUS 4, REMAINDER 3);

INSERT INTO t01 SELECT generate_series(1,10000000),'abc'||generate_series(1,10000000);
DROP TABLE t02;
CREATE TABLE t02(id2 integer,name1 text) PARTITION BY HASH(id2) ;
CREATE TABLE t02_1 PARTITION OF t02   FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE t02_2 PARTITION OF t02  FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE t02_3 PARTITION OF t02  FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE t02_4 PARTITION OF t02 FOR VALUES WITH (MODULUS 4, REMAINDER 3);
INSERT INTO t02 SELECT generate_series(1,3000000),'abc'||generate_series(1,3000000);



3 分区智能连接

3.1 关闭分区智能连接

SHOW enable_partitionwise_join ;
//屏幕输出:
 enable_partitionwise_join 
---------------------------
 off
(1 row)

3.2 查看执行计划

1、当前 work_mem 为4M 时。无法缓存所有的 HASH 数据。
2、 Batches: 32 是指 HASH 缓存了32次。都被缓存在临时表中。效率并不高。

EXPLAIN ANALYZE SELECT id1 , count(*) FROM t01 GROUP BY id1;
//屏幕输出:
 QUERY PLAN                                                             
-------------------------------------------
 Finalize Aggregate  (cost=1067929074.38..1067929074.39 rows=1 width=8) (actual time=2135.213..2248.708 rows=1 loops=1)
   ->  Gather  (cost=1067929074.16..1067929074.37 rows=2 width=8) (actual time=2133.141..2248.691 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=1067928074.16..1067928074.17 rows=1 width=8) (actual time=2113.254..2113.444 rows=1 loops=3)
               ->  Parallel Hash Join  (cost=55477.00..911678074.16 rows=62500000000 width=0) (actual time=1583.112..2074.008 rows=1000000 loops=3)
                     Hash Cond: (t01.id1 = t02.id2)
                     ->  Parallel Append  (cost=0.00..116556.00 rows=4166666 width=4) (actual time=0.087..620.678 rows=3333333 loops=3)
                           ->  Parallel Seq Scan on t01_4  (cost=0.00..23946.36 rows=1042336 width=4) (actual time=0.076..307.641 rows=2501606 loops=1)
                           ->  Parallel Seq Scan on t01_2  (cost=0.00..23934.28 rows=1041828 width=4) (actual time=0.070..302.509 rows=2500388 loops=1)
                           ->  Parallel Seq Scan on t01_1  (cost=0.00..23923.50 rows=1041350 width=4) (actual time=0.058..101.870 rows=833080 loops=3)
                           ->  Parallel Seq Scan on t01_3  (cost=0.00..23918.52 rows=1041152 width=4) (actual time=0.112..309.827 rows=2498765 loops=1)
                     ->  Parallel Hash  (cost=34969.00..34969.00 rows=1250000 width=4) (actual time=496.988..496.990 rows=1000000 loops=3)
                           Buckets: 262144  Batches: 32  Memory Usage: 5760kB
                           ->  Parallel Append  (cost=0.00..34969.00 rows=1250000 width=4) (actual time=178.960..354.000 rows=1000000 loops=3)
                                 ->  Parallel Seq Scan on t02_4  (cost=0.00..7190.62 rows=312962 width=4) (actual time=197.575..285.326 rows=751110 loops=
1)
                                 ->  Parallel Seq Scan on t02_2  (cost=0.00..7188.90 rows=312890 width=4) (actual time=197.273..278.184 rows=750935 loops=
1)
                                 ->  Parallel Seq Scan on t02_1  (cost=0.00..7169.82 rows=312082 width=4) (actual time=0.061..30.200 rows=249665 loops=3)
                                 ->  Parallel Seq Scan on t02_3  (cost=0.00..7169.66 rows=312066 width=4) (actual time=142.029..217.814 rows=748959 loops=
1)
 Planning Time: 0.206 ms
 JIT:
   Functions: 71
   Options: Inlining true, Optimization true, Expressions true, Deforming true
   Timing: Generation 3.574 ms, Inlining 152.357 ms, Optimization 234.494 ms, Emission 150.277 ms, Total 540.702 ms
 Execution Time: 2249.942 ms
(25 rows)

3.3 开启分区智能连接

SET enable_partitionwise_join=on;

3.4 查看执行计划

因为hash 分区,分区列的数据类型与分区数量相同,能够保证相同的数据hash到对应的分区。

1、由于单个分区的数据量更少,更有可能使用内存排序。
2、batches : 8,缓存次数变少了。

EXPLAIN ANALYZE SELECT count(*) FROM t01,t02 WHERE id1=id2;
//屏幕输出:
 QUERY PLAN                                                             
-----------
 Finalize Aggregate  (cost=213601.21..213601.22 rows=1 width=8) (actual time=1844.304..1943.383 rows=1 loops=1)
   ->  Gather  (cost=213600.99..213601.20 rows=2 width=8) (actual time=1841.401..1943.349 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Partial Aggregate  (cost=212600.99..212601.00 rows=1 width=8) (actual time=1820.105..1820.114 rows=1 loops=3)
               ->  Parallel Append  (cost=12323.02..209475.99 rows=1250000 width=0) (actual time=867.670..1776.608 rows=1000000 loops=3)
                     ->  Parallel Hash Join  (cost=12325.65..50851.78 rows=312962 width=0) (actual time=862.010..1267.260 rows=751110 loops=1)
                           Hash Cond: (t01_4.id1 = t02_4.id2)
                           ->  Parallel Seq Scan on t01_4  (cost=0.00..23946.36 rows=1042336 width=4) (actual time=0.080..284.806 rows=2501606 loops=1)
                           ->  Parallel Hash  (cost=7190.62..7190.62 rows=312962 width=4) (actual time=244.417..244.418 rows=751110 loops=1)
                                 Buckets: 262144  Batches: 8  Memory Usage: 5760kB
                                 ->  Parallel Seq Scan on t02_4  (cost=0.00..7190.62 rows=312962 width=4) (actual time=18.441..125.767 rows=751110 loops=1
)
                     ->  Parallel Hash Join  (cost=12323.02..50830.87 rows=312890 width=0) (actual time=875.893..1289.311 rows=750935 loops=1)
                           Hash Cond: (t01_2.id1 = t02_2.id2)
                           ->  Parallel Seq Scan on t01_2  (cost=0.00..23934.28 rows=1041828 width=4) (actual time=0.071..291.676 rows=2500388 loops=1)
                           ->  Parallel Hash  (cost=7188.90..7188.90 rows=312890 width=4) (actual time=250.357..250.358 rows=750935 loops=1)
                                 Buckets: 262144  Batches: 8  Memory Usage: 5760kB
                                 ->  Parallel Seq Scan on t02_2  (cost=0.00..7188.90 rows=312890 width=4) (actual time=17.412..131.929 rows=750935 loops=1
)
                     ->  Parallel Hash Join  (cost=12290.84..50775.75 rows=312082 width=0) (actual time=292.223..440.440 rows=249665 loops=3)
                           Hash Cond: (t01_1.id1 = t02_1.id2)
                           ->  Parallel Seq Scan on t01_1  (cost=0.00..23923.50 rows=1041350 width=4) (actual time=0.057..98.739 rows=833080 loops=3)
                           ->  Parallel Hash  (cost=7169.82..7169.82 rows=312082 width=4) (actual time=72.022..72.022 rows=249665 loops=3)
                                 Buckets: 262144  Batches: 8  Memory Usage: 5728kB
                                 ->  Parallel Seq Scan on t02_1  (cost=0.00..7169.82 rows=312082 width=4) (actual time=0.099..34.730 rows=249665 loops=3)
                     ->  Parallel Hash Join  (cost=12290.49..50767.60 rows=312066 width=0) (actual time=865.103..1257.990 rows=748959 loops=1)
                           Hash Cond: (t01_3.id1 = t02_3.id2)
                           ->  Parallel Seq Scan on t01_3  (cost=0.00..23918.52 rows=1041152 width=4) (actual time=0.078..286.559 rows=2498765 loops=1)
                           ->  Parallel Hash  (cost=7169.66..7169.66 rows=312066 width=4) (actual time=247.173..247.174 rows=748959 loops=1)
                                 Buckets: 262144  Batches: 8  Memory Usage: 5728kB
                                 ->  Parallel Seq Scan on t02_3  (cost=0.00..7169.66 rows=312066 width=4) (actual time=19.626..125.604 rows=748959 loops=1
)
 Planning Time: 1.897 ms
 JIT:
   Functions: 116
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 5.283 ms, Inlining 0.000 ms, Optimization 2.231 ms, Emission 53.717 ms, Total 61.231 ms
 Execution Time: 1969.033 ms
(36 rows)

4 分区聚合

SET enable_partitionwise_aggregate=on;
EXPLAIN ANALYZE SELECT count(*) FROM t01,t02 WHERE id1=id2;
//屏幕输出:

postgres=# EXPLAIN ANALYZE SELECT count(*) FROM t01,t02 WHERE id1=id2;
                                                                            QUERY PLAN                                                                    
        
----------------------------------------------------------------------------------------------------------------------------------------------------------
--------
 Finalize Aggregate  (cost=207351.87..207351.88 rows=1 width=8) (actual time=1729.958..1832.467 rows=1 loops=1)
   ->  Gather  (cost=52613.09..207351.85 rows=8 width=8) (actual time=1273.784..1832.442 rows=6 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Parallel Append  (cost=51613.09..206351.05 rows=4 width=8) (actual time=1281.861..1705.475 rows=2 loops=3)
               ->  Partial Aggregate  (cost=51634.18..51634.19 rows=1 width=8) (actual time=1286.699..1286.702 rows=1 loops=1)
                     ->  Parallel Hash Join  (cost=12325.65..50851.78 rows=312962 width=0) (actual time=863.081..1252.165 rows=751110 loops=1)
                           Hash Cond: (t01_3.id1 = t02_3.id2)
                           ->  Parallel Seq Scan on t01_4 t01_3  (cost=0.00..23946.36 rows=1042336 width=4) (actual time=0.072..280.515 rows=2501606 loops
=1)
                           ->  Parallel Hash  (cost=7190.62..7190.62 rows=312962 width=4) (actual time=255.924..255.924 rows=751110 loops=1)
                                 Buckets: 262144  Batches: 8  Memory Usage: 5760kB
                                 ->  Parallel Seq Scan on t02_4 t02_3  (cost=0.00..7190.62 rows=312962 width=4) (actual time=20.122..137.145 rows=751110 l
oops=1)
               ->  Partial Aggregate  (cost=51613.09..51613.10 rows=1 width=8) (actual time=1285.761..1285.764 rows=1 loops=1)
                     ->  Parallel Hash Join  (cost=12323.02..50830.87 rows=312890 width=0) (actual time=868.068..1253.114 rows=750935 loops=1)
                           Hash Cond: (t01_1.id1 = t02_1.id2)
                           ->  Parallel Seq Scan on t01_2 t01_1  (cost=0.00..23934.28 rows=1041828 width=4) (actual time=0.060..280.214 rows=2500388 loops
=1)
                           ->  Parallel Hash  (cost=7188.90..7188.90 rows=312890 width=4) (actual time=260.389..260.389 rows=750935 loops=1)
                                 Buckets: 262144  Batches: 8  Memory Usage: 5760kB
                                 ->  Parallel Seq Scan on t02_2 t02_1  (cost=0.00..7188.90 rows=312890 width=4) (actual time=20.181..139.650 rows=750935 l
oops=1)
               ->  Partial Aggregate  (cost=51555.95..51555.96 rows=1 width=8) (actual time=423.604..423.606 rows=1 loops=3)
                     ->  Parallel Hash Join  (cost=12290.84..50775.75 rows=312082 width=0) (actual time=287.110..412.499 rows=249665 loops=3)
                           Hash Cond: (t01.id1 = t02.id2)
                           ->  Parallel Seq Scan on t01_1 t01  (cost=0.00..23923.50 rows=1041350 width=4) (actual time=0.057..96.679 rows=833080 loops=3)
                           ->  Parallel Hash  (cost=7169.82..7169.82 rows=312082 width=4) (actual time=71.926..71.927 rows=249665 loops=3)
                                 Buckets: 262144  Batches: 8  Memory Usage: 5760kB
                                 ->  Parallel Seq Scan on t02_1 t02  (cost=0.00..7169.82 rows=312082 width=4) (actual time=0.069..35.350 rows=249665 loops
=3)
               ->  Partial Aggregate  (cost=51547.76..51547.77 rows=1 width=8) (actual time=1273.119..1273.121 rows=1 loops=1)
                     ->  Parallel Hash Join  (cost=12290.49..50767.60 rows=312066 width=0) (actual time=861.259..1239.197 rows=748959 loops=1)
                           Hash Cond: (t01_2.id1 = t02_2.id2)
                           ->  Parallel Seq Scan on t01_3 t01_2  (cost=0.00..23918.52 rows=1041152 width=4) (actual time=0.054..283.870 rows=2498765 loops
=1)
                           ->  Parallel Hash  (cost=7169.66..7169.66 rows=312066 width=4) (actual time=250.134..250.135 rows=748959 loops=1)
                                 Buckets: 262144  Batches: 8  Memory Usage: 5728kB
                                 ->  Parallel Seq Scan on t02_3 t02_2  (cost=0.00..7169.66 rows=312066 width=4) (actual time=20.139..128.932 rows=748959 l
oops=1)
 Planning Time: 0.558 ms
 JIT:
   Functions: 134
   Options: Inlining false, Optimization false, Expressions true, Deforming true
   Timing: Generation 6.791 ms, Inlining 0.000 ms, Optimization 2.127 ms, Emission 58.685 ms, Total 67.603 ms
 Execution Time: 1835.140 ms
(39 rows)

4.1 关闭智能聚合

SHOW enable_partitionwise_aggregate;
//屏幕输出:
 enable_partitionwise_aggregate 
--------------------------------
 off
(1 row)

4.2 开启只能聚合

SET enable_partitionwise_aggregate=on;